In this part of the workshop we will talk about raster data. We will
start with an introduction of the fundamental principles, packages, and
metadata needed to work with raster data in R. We will discuss some of
the most important metadata elements, including CRS and resolution. We
continue to work with the tidyverse and here
packages and we will use two additional packages to work with raster
data: raster and rgdal.
The dataset we chose is a set of GeoTIFF files on Delft or subdivisions of it. The files were included in the archive you downloaded before the workshop.
The GeoTIFF format contains a set of embedded tags with metadata
about the raster data. We can use the function GDALinfo() from the
rgdal package to get information about our raster data
before we read that data into R. It is recommended to do this before
importing your data.
GDALinfo(here("data","tud-dsm.tif"))
## rows 3857
## columns 7221
## bands 1
## lower left origin.x 83569.5
## lower left origin.y 445251.5
## res.x 0.5
## res.y 0.5
## ysign -1
## oblique.x 0
## oblique.y 0
## driver GTiff
## projection +proj=sterea +lat_0=52.1561605555556 +lon_0=5.38763888888889
## +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs
## file /Users/claudiuforgaci/Projects/geospatial-data-carpentry-tud-2022-11/data/tud-dsm.tif
## apparent band summary:
## GDType hasNoDataValue NoDataValue blockSize1 blockSize2
## 1 Float32 FALSE 0 1 7221
## apparent band statistics:
## Bmin Bmax Bmean Bsd
## 1 -4294967295 4294967295 NA NA
## Metadata:
## AREA_OR_POINT=Area
Now that we’ve previewed the metadata for our GeoTIFF, let’s import
this raster dataset. We are going to work with a Digital Surface Model
(DSM) which is in the GeoTIFF format. We can use the
raster() function to import a raster file.
DSM_TUD <- raster(here("data","tud-dsm.tif"))
DSM_TUD
## class : RasterLayer
## dimensions : 3857, 7221, 27851397 (nrow, ncol, ncell)
## resolution : 0.5, 0.5 (x, y)
## extent : 83569.5, 87180, 445251.5, 447180 (xmin, xmax, ymin, ymax)
## crs : +proj=sterea +lat_0=52.1561605555556 +lon_0=5.38763888888889 +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs
## source : tud-dsm.tif
## names : tud.dsm
Similar to other data structures in R like vectors and data frame
columns, descriptive statistics for raster data can be retreived with
the summary() function.
summary(DSM_TUD)
## tud.dsm
## Min. -5.35300
## 1st Qu. -0.92200
## Median 0.20100
## 3rd Qu. 5.03825
## Max. 87.37800
## NA's 0.00000
But note the warning. Unless you force R to calculate these
statistics using every cell, it will take a random sample of 100,000
cells and calculate from them instead. Now, raster objects are not data
frames so you cannot count the cells with nrow(), but with
the ncell() function of the raster
package.
summary(DSM_TUD, maxsamp = ncell(DSM_TUD))
## tud.dsm
## Min. -9.375
## 1st Qu. -0.919
## Median 0.197
## 3rd Qu. 5.055
## Max. 93.652
## NA's 0.000
To visualise raster data in ggplot2, we will need to
convert it to a data frame. The raster package has a
as.data.frame function for that purpose.
DSM_TUD_df <- as.data.frame(DSM_TUD, xy = TRUE)
Now when we view the structure of our data, we will see a standard dataframe format.
str(DSM_TUD_df)
## 'data.frame': 27851397 obs. of 3 variables:
## $ x : num 83570 83570 83571 83571 83572 ...
## $ y : num 447180 447180 447180 447180 447180 ...
## $ tud.dsm: num 10.84 10.4 9.91 9.46 9.03 ...
We can use ggplot() to plot this data with another
geom_ function called geom_raster(). The
coord_quickmap() gives a quick approximation that preserves
straight lines. This approximation is suitable for small areas that are
not too close to the poles.
ggplot() +
geom_raster(data = DSM_TUD_df , aes(x = x, y = y, fill = tud.dsm)) +
scale_fill_viridis_c() + # remember, the this color palette was introduced in the first lesson
coord_quickmap()
These map, a so-called Digital Surface Model, shows the elevation of our study site, including buildings and vegetation.
For faster previews, you can use the plot()
function.
plot(DSM_TUD)
But what units are these? This information is specified in the CRS, so let’s have a closer look at the CRS of our raster.
With raster objects we use the crs() function from the
raster package.
crs(DSM_TUD)
## Coordinate Reference System:
## Deprecated Proj.4 representation:
## +proj=sterea +lat_0=52.1561605555556 +lon_0=5.38763888888889
## +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs
## WKT2 2019 representation:
## PROJCRS["Amersfoort / RD New",
## BASEGEOGCRS["Amersfoort",
## DATUM["Amersfoort",
## ELLIPSOID["Bessel 1841",6377397.155,299.1528128,
## LENGTHUNIT["metre",1]]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433]],
## ID["EPSG",4289]],
## CONVERSION["RD New",
## METHOD["Oblique Stereographic",
## ID["EPSG",9809]],
## PARAMETER["Latitude of natural origin",52.1561605555556,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8801]],
## PARAMETER["Longitude of natural origin",5.38763888888889,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8802]],
## PARAMETER["Scale factor at natural origin",0.9999079,
## SCALEUNIT["unity",1],
## ID["EPSG",8805]],
## PARAMETER["False easting",155000,
## LENGTHUNIT["metre",1],
## ID["EPSG",8806]],
## PARAMETER["False northing",463000,
## LENGTHUNIT["metre",1],
## ID["EPSG",8807]]],
## CS[Cartesian,2],
## AXIS["easting (X)",east,
## ORDER[1],
## LENGTHUNIT["metre",1]],
## AXIS["northing (Y)",north,
## ORDER[2],
## LENGTHUNIT["metre",1]],
## USAGE[
## SCOPE["Engineering survey, topographic mapping."],
## AREA["Netherlands - onshore, including Waddenzee, Dutch Wadden Islands and 12-mile offshore coastal zone."],
## BBOX[50.75,3.2,53.7,7.22]],
## ID["EPSG",28992]]
We see that the units are in meters in the Proj4 string:
+units=m.
It is useful to know the minimum and maximum values of a raster dataset. In the case of a DSM, those values represent the min/max elevation range of our site.
minValue(DSM_TUD)
## [1] NA
maxValue(DSM_TUD)
## [1] NA
If Min and Max values haven’t been calculated, you can set them with
the raster::setMinMax() function.
DSM_TUD <- raster::setMinMax(DSM_TUD)
To see how many bands a raster dataset has, use the
raster::nlayers() function.
nlayers(DSM_TUD)
## [1] 1
This dataset has only 1 band. We will discuss multi-band raster data in a later episode.
A histogram can be used to inspect the distribution of raster values
visually. It can show if there are values above the max or below the min
of the expected range. We can create a histogram with the ggplot2
function geom_histogram().
ggplot() +
geom_histogram(data = DSM_TUD_df, aes(tud.dsm))
Adjust the level of desired detail by setting the number of bins.
ggplot() +
geom_histogram(data = DSM_TUD_df, aes(tud.dsm), bins = 40)
Looking at the distribution can help us identify bad data values, that is, values that are out of the min/max range.
Use GDALinfo() to determine the following about the
tud-dsm-hill.tif file:
DSM_TUD?GDALinfo(here("data","tud-dsm-hill.tif"))
## rows 3857
## columns 7221
## bands 1
## lower left origin.x 83569.5
## lower left origin.y 445251.5
## res.x 0.5
## res.y 0.5
## ysign -1
## oblique.x 0
## oblique.y 0
## driver GTiff
## projection +proj=sterea +lat_0=52.1561605555556 +lon_0=5.38763888888889
## +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs
## file /Users/claudiuforgaci/Projects/geospatial-data-carpentry-tud-2022-11/data/tud-dsm-hill.tif
## apparent band summary:
## GDType hasNoDataValue NoDataValue blockSize1 blockSize2
## 1 Float32 TRUE -9999 1 7221
## apparent band statistics:
## Bmin Bmax Bmean Bsd
## 1 -4294967295 4294967295 NA NA
## Metadata:
## AREA_OR_POINT=Area
In this part we will plot our raster object using
ggplot2 with customized coloring schemes. In the previous
plot, our DSM was colored with a continuous colour range. For clarity
and visibility, we may prefer to view the data “symbolized” or colored
according to ranges of values. This is comparable to a “classified” map.
For that, we need to tell ggplot how many groups to break
our data into and where those breaks should be. To make these decisions,
it is useful to first explore the distribution of the data using a bar
plot. To begin with, we will use dplyr’s mutate() function
combined with cut() to split the data into 3 bins.
DSM_TUD_df <- DSM_TUD_df %>%
mutate(fct_elevation = cut(tud.dsm, breaks = 3))
ggplot() +
geom_bar(data = DSM_TUD_df, aes(fct_elevation))
To see the cutoff values:
unique(DSM_TUD_df$fct_elevation)
## [1] (-9.48,25] (25,59.3] (59.3,93.8]
## Levels: (-9.48,25] (25,59.3] (59.3,93.8]
To show count number of pixels in each group:
DSM_TUD_df %>%
group_by(fct_elevation) %>%
count()
## # A tibble: 3 × 2
## # Groups: fct_elevation [3]
## fct_elevation n
## <fct> <int>
## 1 (-9.48,25] 27676535
## 2 (25,59.3] 167737
## 3 (59.3,93.8] 7125
To customize cutoff values:
custom_bins <- c(-10, 0, 5, 100)
head(DSM_TUD_df)
## x y tud.dsm fct_elevation
## 1 83569.75 447179.8 10.840 (-9.48,25]
## 2 83570.25 447179.8 10.399 (-9.48,25]
## 3 83570.75 447179.8 9.909 (-9.48,25]
## 4 83571.25 447179.8 9.460 (-9.48,25]
## 5 83571.75 447179.8 9.027 (-9.48,25]
## 6 83572.25 447179.8 8.950 (-9.48,25]
DSM_TUD_df <- DSM_TUD_df %>%
mutate(fct_elevation_cb = cut(tud.dsm, breaks = custom_bins))
head(DSM_TUD_df)
## x y tud.dsm fct_elevation fct_elevation_cb
## 1 83569.75 447179.8 10.840 (-9.48,25] (5,100]
## 2 83570.25 447179.8 10.399 (-9.48,25] (5,100]
## 3 83570.75 447179.8 9.909 (-9.48,25] (5,100]
## 4 83571.25 447179.8 9.460 (-9.48,25] (5,100]
## 5 83571.75 447179.8 9.027 (-9.48,25] (5,100]
## 6 83572.25 447179.8 8.950 (-9.48,25] (5,100]
unique(DSM_TUD_df$fct_elevation_cb)
## [1] (5,100] (0,5] (-10,0]
## Levels: (-10,0] (0,5] (5,100]
ggplot() +
geom_bar(data = DSM_TUD_df, aes(fct_elevation_cb))
DSM_TUD_df %>%
group_by(fct_elevation_cb) %>%
count()
## # A tibble: 3 × 2
## # Groups: fct_elevation_cb [3]
## fct_elevation_cb n
## <fct> <int>
## 1 (-10,0] 12953410
## 2 (0,5] 7891233
## 3 (5,100] 7006754
ggplot() +
geom_raster(data = DSM_TUD_df , aes(x = x, y = y, fill = fct_elevation_cb)) +
coord_quickmap()
The plot above uses the default colors inside ggplot for raster
objects. We can specify our own colors to make the plot look a little
nicer. R has a built in set of colors for plotting terrain, which are
built in to the terrain.colors() function. Since we have
three bins, we want to create a 3-color palette:
terrain.colors(3)
## [1] "#00A600" "#ECB176" "#F2F2F2"
ggplot() +
geom_raster(data = DSM_TUD_df , aes(x = x, y = y,
fill = fct_elevation_cb)) +
scale_fill_manual(values = terrain.colors(3)) +
coord_quickmap()
my_col <- terrain.colors(3)
ggplot() +
geom_raster(data = DSM_TUD_df , aes(x = x, y = y,
fill = fct_elevation_cb)) +
scale_fill_manual(values = my_col, name = "Elevation") +
coord_quickmap()
Create a plot of the TU Delft Digital Surface Model
(DSM_TUD) that has:
DSM_TUD_df <- DSM_TUD_df %>%
mutate(fct_elevation_6 = cut(tud.dsm, breaks = 6))
unique(DSM_TUD_df$fct_elevation_6)
## [1] (7.8,25] (-9.48,7.8] (25,42.1] (42.1,59.3] (76.5,93.8] (59.3,76.5]
## 6 Levels: (-9.48,7.8] (7.8,25] (25,42.1] (42.1,59.3] ... (76.5,93.8]
my_col <- terrain.colors(6)
ggplot() +
geom_raster(data = DSM_TUD_df, aes(x = x, y = y,
fill = fct_elevation_6)) +
scale_fill_manual(values = my_col, name = "Elevation") +
coord_quickmap() +
xlab("X") +
ylab("Y") +
labs(title = "Elevation Classes of the Digital Surface Model (DSM)")
We can layer a raster on top of a hillshade raster for the same area, and use a transparency factor to create a 3-dimensional shaded effect. A hillshade is a raster that maps the shadows and texture that you would see from above when viewing terrain. We will add a custom color, making the plot grey.
DSM_hill_TUD <- raster(here("data","tud-dsm-hill.tif"))
DSM_hill_TUD
## class : RasterLayer
## dimensions : 3857, 7221, 27851397 (nrow, ncol, ncell)
## resolution : 0.5, 0.5 (x, y)
## extent : 83569.5, 87180, 445251.5, 447180 (xmin, xmax, ymin, ymax)
## crs : +proj=sterea +lat_0=52.1561605555556 +lon_0=5.38763888888889 +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs
## source : tud-dsm-hill.tif
## names : tud.dsm.hill
DSM_hill_TUD_df <- as.data.frame(DSM_hill_TUD, xy = TRUE)
str(DSM_hill_TUD_df)
## 'data.frame': 27851397 obs. of 3 variables:
## $ x : num 83570 83570 83571 83571 83572 ...
## $ y : num 447180 447180 447180 447180 447180 ...
## $ tud.dsm.hill: num 51.7 48.7 51.4 62.1 110.4 ...
ggplot() +
geom_raster(data = DSM_hill_TUD_df,
aes(x = x, y = y, alpha = tud.dsm.hill)) +
scale_alpha(range = c(0.15, 0.65), guide = "none") +
coord_quickmap()
We can layer another raster on top of our hillshade by adding another
call to the geom_raster() function. Let’s overlay DSM_TUD
on top of the DSM_hill_TUD.
ggplot() +
geom_raster(data = DSM_TUD_df ,
aes(x = x, y = y,
fill = tud.dsm)) +
geom_raster(data = DSM_hill_TUD_df,
aes(x = x, y = y,
alpha = tud.dsm.hill)) +
scale_fill_viridis_c() +
scale_alpha(range = c(0.15, 0.65), guide = "none") +
ggtitle("Elevation with hillshade") +
coord_quickmap()
Use the tud-dtm.tif and tud-dtm-hill.tif
files from the data directory to create a Digital Terrain
Model map of the TU Delft area.
Make sure to:
# import DTM
DTM_TUD <- raster(here("data","tud-dtm.tif"))
DTM_TUD_df <- as.data.frame(DTM_TUD, xy = TRUE)
# DTM Hillshade
DTM_hill_TUD <- raster(here("data","tud-dtm-hill.tif"))
DTM_hill_TUD_df <- as.data.frame(DTM_hill_TUD, xy = TRUE)
ggplot() +
geom_raster(data = DTM_TUD_df ,
aes(x = x, y = y,
fill = tud.dtm,
alpha = 2.0)
) +
geom_raster(data = DTM_hill_TUD_df,
aes(x = x, y = y,
alpha = tud.dtm.hill)
) +
scale_fill_viridis_c() +
guides(fill = guide_colorbar()) +
scale_alpha(range = c(0.4, 0.7), guide = "none") +
theme_bw() +
theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank()) +
theme(axis.title.x = element_blank(),
axis.title.y = element_blank()) +
ggtitle("DTM with Hillshade") +
coord_quickmap()
What happens when maps don’t line up? That is usually a sign that layers are in different coordinate reference systems (CRS).
In this episode, we will be working with the digital terrain model.
DTM_TUD <- raster(here("data","tud-dtm.tif"))
DTM_hill_TUD <- raster(here("data","tud-dtm-hill-ETRS89.tif"))
DTM_TUD_df <- as.data.frame(DTM_TUD, xy = TRUE)
DTM_hill_TUD_df <- as.data.frame(DTM_hill_TUD, xy = TRUE)
ggplot() +
geom_raster(data = DTM_TUD_df ,
aes(x = x, y = y,
fill = tud.dtm)) +
geom_raster(data = DTM_hill_TUD_df,
aes(x = x, y = y,
alpha = tud.dtm.hill.ETRS89)) +
scale_fill_gradientn(name = "Elevation", colors = terrain.colors(10)) +
coord_quickmap()
Our results are curious - neither the Digital Terrain Model
(DTM_TUD_df) nor the DTM Hillshade
(DTM_hill_TUD_df) plotted. Let’s try to plot the DTM on its
own to make sure there are data there.
ggplot() +
geom_raster(data = DTM_TUD_df,
aes(x = x, y = y,
fill = tud.dtm)) +
scale_fill_gradientn(name = "Elevation", colors = terrain.colors(10)) +
coord_quickmap()
ggplot() +
geom_raster(data = DTM_hill_TUD_df,
aes(x = x, y = y,
alpha = tud.dtm.hill.ETRS89)) +
coord_quickmap()
If we look at the axes, we can see that the projections of the two rasters are different.
View the CRS for each of these two datasets. What projection does each use?
crs(DTM_TUD)
## Coordinate Reference System:
## Deprecated Proj.4 representation:
## +proj=sterea +lat_0=52.1561605555556 +lon_0=5.38763888888889
## +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs
## WKT2 2019 representation:
## PROJCRS["Amersfoort / RD New",
## BASEGEOGCRS["Amersfoort",
## DATUM["Amersfoort",
## ELLIPSOID["Bessel 1841",6377397.155,299.1528128,
## LENGTHUNIT["metre",1]]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433]],
## ID["EPSG",4289]],
## CONVERSION["RD New",
## METHOD["Oblique Stereographic",
## ID["EPSG",9809]],
## PARAMETER["Latitude of natural origin",52.1561605555556,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8801]],
## PARAMETER["Longitude of natural origin",5.38763888888889,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8802]],
## PARAMETER["Scale factor at natural origin",0.9999079,
## SCALEUNIT["unity",1],
## ID["EPSG",8805]],
## PARAMETER["False easting",155000,
## LENGTHUNIT["metre",1],
## ID["EPSG",8806]],
## PARAMETER["False northing",463000,
## LENGTHUNIT["metre",1],
## ID["EPSG",8807]]],
## CS[Cartesian,2],
## AXIS["easting (X)",east,
## ORDER[1],
## LENGTHUNIT["metre",1]],
## AXIS["northing (Y)",north,
## ORDER[2],
## LENGTHUNIT["metre",1]],
## USAGE[
## SCOPE["Engineering survey, topographic mapping."],
## AREA["Netherlands - onshore, including Waddenzee, Dutch Wadden Islands and 12-mile offshore coastal zone."],
## BBOX[50.75,3.2,53.7,7.22]],
## ID["EPSG",28992]]
crs(DTM_hill_TUD)
## Coordinate Reference System:
## Deprecated Proj.4 representation: +proj=longlat +ellps=GRS80 +no_defs
## WKT2 2019 representation:
## GEOGCRS["ETRS89 (with axis order normalized for visualization)",
## DATUM["European Terrestrial Reference System 1989",
## ELLIPSOID["GRS 1980",6378137,298.257222101004,
## LENGTHUNIT["metre",1]]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433]],
## CS[ellipsoidal,2],
## AXIS["geodetic longitude (Lon)",east,
## ORDER[1],
## ANGLEUNIT["degree",0.0174532925199433,
## ID["EPSG",9122]]],
## AXIS["geodetic latitude (Lat)",north,
## ORDER[2],
## ANGLEUNIT["degree",0.0174532925199433,
## ID["EPSG",9122]]],
## REMARK["Axis order reversed compared to EPSG:4258"]]
We can use the projectRaster() function to reproject a
raster into a new CRS. Keep in mind that reprojection only works when
you first have a defined CRS for the raster object that you want to
reproject. It cannot be used if no CRS is defined.
DTM_hill_EPSG28992_TUD <- projectRaster(DTM_hill_TUD,
crs = crs(DTM_TUD))
crs(DTM_hill_EPSG28992_TUD)
## Coordinate Reference System:
## Deprecated Proj.4 representation:
## +proj=sterea +lat_0=52.1561605555556 +lon_0=5.38763888888889
## +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs
## WKT2 2019 representation:
## PROJCRS["Amersfoort / RD New",
## BASEGEOGCRS["Amersfoort",
## DATUM["Amersfoort",
## ELLIPSOID["Bessel 1841",6377397.155,299.1528128,
## LENGTHUNIT["metre",1]]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433]],
## ID["EPSG",4289]],
## CONVERSION["RD New",
## METHOD["Oblique Stereographic",
## ID["EPSG",9809]],
## PARAMETER["Latitude of natural origin",52.1561605555556,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8801]],
## PARAMETER["Longitude of natural origin",5.38763888888889,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8802]],
## PARAMETER["Scale factor at natural origin",0.9999079,
## SCALEUNIT["unity",1],
## ID["EPSG",8805]],
## PARAMETER["False easting",155000,
## LENGTHUNIT["metre",1],
## ID["EPSG",8806]],
## PARAMETER["False northing",463000,
## LENGTHUNIT["metre",1],
## ID["EPSG",8807]]],
## CS[Cartesian,2],
## AXIS["easting (X)",east,
## ORDER[1],
## LENGTHUNIT["metre",1]],
## AXIS["northing (Y)",north,
## ORDER[2],
## LENGTHUNIT["metre",1]],
## USAGE[
## SCOPE["Engineering survey, topographic mapping."],
## AREA["Netherlands - onshore, including Waddenzee, Dutch Wadden Islands and 12-mile offshore coastal zone."],
## BBOX[50.75,3.2,53.7,7.22]],
## ID["EPSG",28992]]
crs(DTM_hill_TUD)
## Coordinate Reference System:
## Deprecated Proj.4 representation: +proj=longlat +ellps=GRS80 +no_defs
## WKT2 2019 representation:
## GEOGCRS["ETRS89 (with axis order normalized for visualization)",
## DATUM["European Terrestrial Reference System 1989",
## ELLIPSOID["GRS 1980",6378137,298.257222101004,
## LENGTHUNIT["metre",1]]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433]],
## CS[ellipsoidal,2],
## AXIS["geodetic longitude (Lon)",east,
## ORDER[1],
## ANGLEUNIT["degree",0.0174532925199433,
## ID["EPSG",9122]]],
## AXIS["geodetic latitude (Lat)",north,
## ORDER[2],
## ANGLEUNIT["degree",0.0174532925199433,
## ID["EPSG",9122]]],
## REMARK["Axis order reversed compared to EPSG:4258"]]
extent(DTM_hill_EPSG28992_TUD)
## class : Extent
## xmin : 83539.42
## xmax : 87208.82
## ymin : 445198.4
## ymax : 447232.9
extent(DTM_hill_TUD)
## class : Extent
## xmin : 4.346815
## xmax : 4.399776
## ymin : 51.99105
## ymax : 52.00884
Why do you think the two extents differ?
res(DTM_hill_EPSG28992_TUD)
## [1] 0.493 0.507
res(DTM_TUD)
## [1] 0.5 0.5
These two resolutions are different, but they’re representing the
same data. We can tell R to force our newly reprojected raster to be the
same as DTM_TUD by using res(DTM_TUD).
DTM_hill_EPSG28992_TUD <- projectRaster(DTM_hill_TUD,
crs = crs(DTM_TUD),
res = res(DTM_TUD))
res(DTM_hill_EPSG28992_TUD)
## [1] 0.5 0.5
res(DTM_TUD)
## [1] 0.5 0.5
DTM_hill_TUD_2_df <- as.data.frame(DTM_hill_EPSG28992_TUD, xy = TRUE)
ggplot() +
geom_raster(data = DTM_TUD_df ,
aes(x = x, y = y,
fill = tud.dtm)) +
geom_raster(data = DTM_hill_TUD_2_df,
aes(x = x, y = y,
alpha = tud.dtm.hill.ETRS89)) +
scale_fill_gradientn(name = "Elevation", colors = terrain.colors(10)) +
coord_quickmap()
We often want to perform calculations on two or more rasters to create a new output raster. For example, if we are interested in mapping the heights of trees and buildings across an entire field site, we might want to calculate the difference between the Digital Surface Model (DSM, tops of trees and buildings) and the Digital Terrain Model (DTM, ground level). The resulting dataset is referred to as a Canopy Height Model (CHM) and represents the actual height of trees, buildings, etc. with the influence of ground elevation removed.
GDALinfo(here("data","tud-dtm.tif"))
## rows 3857
## columns 7221
## bands 1
## lower left origin.x 83569.5
## lower left origin.y 445251.5
## res.x 0.5
## res.y 0.5
## ysign -1
## oblique.x 0
## oblique.y 0
## driver GTiff
## projection +proj=sterea +lat_0=52.1561605555556 +lon_0=5.38763888888889
## +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs
## file /Users/claudiuforgaci/Projects/geospatial-data-carpentry-tud-2022-11/data/tud-dtm.tif
## apparent band summary:
## GDType hasNoDataValue NoDataValue blockSize1 blockSize2
## 1 Float32 FALSE 0 1 7221
## apparent band statistics:
## Bmin Bmax Bmean Bsd
## 1 -4294967295 4294967295 NA NA
## Metadata:
## AREA_OR_POINT=Area
GDALinfo(here("data","tud-dsm.tif"))
## rows 3857
## columns 7221
## bands 1
## lower left origin.x 83569.5
## lower left origin.y 445251.5
## res.x 0.5
## res.y 0.5
## ysign -1
## oblique.x 0
## oblique.y 0
## driver GTiff
## projection +proj=sterea +lat_0=52.1561605555556 +lon_0=5.38763888888889
## +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs
## file /Users/claudiuforgaci/Projects/geospatial-data-carpentry-tud-2022-11/data/tud-dsm.tif
## apparent band summary:
## GDType hasNoDataValue NoDataValue blockSize1 blockSize2
## 1 Float32 FALSE 0 1 7221
## apparent band statistics:
## Bmin Bmax Bmean Bsd
## 1 -4294967295 4294967295 NA NA
## Metadata:
## AREA_OR_POINT=Area
ggplot() +
geom_raster(data = DTM_TUD_df ,
aes(x = x, y = y, fill = tud.dtm)) +
scale_fill_gradientn(name = "Elevation", colors = terrain.colors(10)) +
coord_quickmap()
ggplot() +
geom_raster(data = DSM_TUD_df ,
aes(x = x, y = y, fill = tud.dsm)) +
scale_fill_gradientn(name = "Elevation", colors = terrain.colors(10)) +
coord_quickmap()
CHM_TUD <- DSM_TUD - DTM_TUD
CHM_TUD_df <- as.data.frame(CHM_TUD, xy = TRUE)
ggplot() +
geom_raster(data = CHM_TUD_df ,
aes(x = x, y = y, fill = layer)) +
scale_fill_gradientn(name = "Canopy Height", colors = terrain.colors(10)) +
coord_quickmap()
ggplot(CHM_TUD_df) +
geom_histogram(aes(layer))
It’s often a good idea to explore the range of values in a raster dataset just like we might explore a dataset that we collected in the field.
CHM_TUD that we just created?CHM_TUD?min(CHM_TUD_df$layer, na.rm = TRUE)
## [1] -9.148
max(CHM_TUD_df$layer, na.rm = TRUE)
## [1] 93.652
ggplot(CHM_TUD_df) +
geom_histogram(aes(layer))
ggplot(CHM_TUD_df) +
geom_histogram(aes(layer), colour="black",
fill="darkgreen", bins = 6)
custom_bins <- c(0, 10, 20, 30, 100)
CHM_TUD_df <- CHM_TUD_df %>%
mutate(canopy_discrete = cut(layer, breaks = custom_bins))
ggplot() +
geom_raster(data = CHM_TUD_df , aes(x = x, y = y,
fill = canopy_discrete)) +
scale_fill_manual(values = terrain.colors(4)) +
coord_quickmap()
writeRaster(CHM_TUD, here("fig_output","CHM_TUD.tiff"),
format="GTiff",
overwrite=TRUE,
NAflag=-9999)
In this episode, the multi-band data that we are working with is high resolution imagery for the Netherlands.
The raster() function only reads in the first band, in
this case the red band of an RGB raster.
RGB_band1_TUD <- raster(here("data","tudlib-rgb.tif"))
RGB_band1_TUD_df <- as.data.frame(RGB_band1_TUD, xy = TRUE)
ggplot() +
geom_raster(data = RGB_band1_TUD_df,
aes(x = x, y = y, alpha = tudlib.rgb)) +
coord_quickmap() # use `coord_equal()` instead
RGB_band1_TUD
## class : RasterLayer
## band : 1 (of 3 bands)
## dimensions : 4988, 4866, 24271608 (nrow, ncol, ncell)
## resolution : 0.08, 0.08 (x, y)
## extent : 85272, 85661.28, 446295.2, 446694.2 (xmin, xmax, ymin, ymax)
## crs : +proj=sterea +lat_0=52.1561605555556 +lon_0=5.38763888888889 +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs
## source : tudlib-rgb.tif
## names : tudlib.rgb
## values : 0, 255 (min, max)
nbands(RGB_band1_TUD)
## [1] 3
To import the green band:
RGB_band2_TUD <- raster(here("data","tudlib-rgb.tif"), band = 2)
RGB_band2_TUD_df <- as.data.frame(RGB_band2_TUD, xy = TRUE)
ggplot() +
geom_raster(data = RGB_band2_TUD_df,
aes(x = x, y = y, alpha = tudlib.rgb)) +
coord_equal()
There is a better way of reading in all bands. The
stack() function brings in all bands
RGB_stack_TUD <- stack(here("data","tudlib-rgb.tif"))
RGB_stack_TUD
## class : RasterStack
## dimensions : 4988, 4866, 24271608, 3 (nrow, ncol, ncell, nlayers)
## resolution : 0.08, 0.08 (x, y)
## extent : 85272, 85661.28, 446295.2, 446694.2 (xmin, xmax, ymin, ymax)
## crs : +proj=sterea +lat_0=52.1561605555556 +lon_0=5.38763888888889 +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs
## names : tudlib.rgb.1, tudlib.rgb.2, tudlib.rgb.3
## min values : 0, 0, 0
## max values : 255, 255, 255
RGB_stack_TUD@layers
## [[1]]
## class : RasterLayer
## band : 1 (of 3 bands)
## dimensions : 4988, 4866, 24271608 (nrow, ncol, ncell)
## resolution : 0.08, 0.08 (x, y)
## extent : 85272, 85661.28, 446295.2, 446694.2 (xmin, xmax, ymin, ymax)
## crs : +proj=sterea +lat_0=52.1561605555556 +lon_0=5.38763888888889 +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs
## source : tudlib-rgb.tif
## names : tudlib.rgb.1
## values : 0, 255 (min, max)
##
##
## [[2]]
## class : RasterLayer
## band : 2 (of 3 bands)
## dimensions : 4988, 4866, 24271608 (nrow, ncol, ncell)
## resolution : 0.08, 0.08 (x, y)
## extent : 85272, 85661.28, 446295.2, 446694.2 (xmin, xmax, ymin, ymax)
## crs : +proj=sterea +lat_0=52.1561605555556 +lon_0=5.38763888888889 +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs
## source : tudlib-rgb.tif
## names : tudlib.rgb.2
## values : 0, 255 (min, max)
##
##
## [[3]]
## class : RasterLayer
## band : 3 (of 3 bands)
## dimensions : 4988, 4866, 24271608 (nrow, ncol, ncell)
## resolution : 0.08, 0.08 (x, y)
## extent : 85272, 85661.28, 446295.2, 446694.2 (xmin, xmax, ymin, ymax)
## crs : +proj=sterea +lat_0=52.1561605555556 +lon_0=5.38763888888889 +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs
## source : tudlib-rgb.tif
## names : tudlib.rgb.3
## values : 0, 255 (min, max)
RGB_stack_TUD[[2]]
## class : RasterLayer
## band : 2 (of 3 bands)
## dimensions : 4988, 4866, 24271608 (nrow, ncol, ncell)
## resolution : 0.08, 0.08 (x, y)
## extent : 85272, 85661.28, 446295.2, 446694.2 (xmin, xmax, ymin, ymax)
## crs : +proj=sterea +lat_0=52.1561605555556 +lon_0=5.38763888888889 +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs
## source : tudlib-rgb.tif
## names : tudlib.rgb.2
## values : 0, 255 (min, max)
RGB_stack_TUD_df <- as.data.frame(RGB_stack_TUD, xy = TRUE)
str(RGB_stack_TUD_df)
## 'data.frame': 24271608 obs. of 5 variables:
## $ x : num 85272 85272 85272 85272 85272 ...
## $ y : num 446694 446694 446694 446694 446694 ...
## $ tudlib.rgb.1: int 52 48 47 49 47 45 47 48 49 54 ...
## $ tudlib.rgb.2: int 64 58 57 60 57 55 58 59 62 69 ...
## $ tudlib.rgb.3: int 57 49 49 53 50 47 51 54 57 68 ...
ggplot() +
geom_histogram(data = RGB_stack_TUD_df, aes(tudlib.rgb.1))
ggplot() +
geom_raster(data = RGB_stack_TUD_df,
aes(x = x, y = y, alpha = tudlib.rgb.2)) +
coord_equal()
To create an RGB image, we will use the `plotRGB
function from the raster package.
plotRGB(RGB_stack_TUD,
r = 1, g = 2, b = 3)
The image above looks pretty good. We can explore whether applying a
stretch to the image might improve clarity and contrast using
stretch="lin" or stretch="hist".
plotRGB(RGB_stack_TUD,
r = 1, g = 2, b = 3,
scale = 800,
stretch = "lin")
plotRGB(RGB_stack_TUD,
r = 1, g = 2, b = 3,
scale = 800,
stretch = "hist")
The R RasterStack and RasterBrick object types can both store multiple bands. However, how they store each band is different. The bands in a RasterStack are stored as links to raster data that is located somewhere on our computer. A RasterBrick contains all of the objects stored within the actual R object. In most cases, we can work with a RasterBrick in the same way we might work with a RasterStack. However a RasterBrick is often more efficient and faster to process - which is important when working with larger files.
object.size(RGB_stack_TUD)
## 51048 bytes
RGB_brick_TUD <- brick(RGB_stack_TUD)
object.size(RGB_brick_TUD)
## 291274664 bytes
plotRGB(RGB_brick_TUD)